material synthesis
Benchmarking large language models for materials synthesis: the case of atomic layer deposition
Yanguas-Gil, Angel, Dearing, Matthew T., Elam, Jeffrey W., Jones, Jessica C., Kim, Sungjoon, Mohammad, Adnan, Nguyen, Chi Thang, Sengupta, Bratin
In this work we introduce an open-ended question benchmark, ALDbench, to evaluate the performance of large language models (LLMs) in materials synthesis, and in particular in the field of atomic layer deposition, a thin film growth technique used in energy applications and microelectronics. Our benchmark comprises questions with a level of difficulty ranging from graduate level to domain expert current with the state of the art in the field. Human experts reviewed the questions along the criteria of difficulty and specificity, and the model responses along four different criteria: overall quality, specificity, relevance, and accuracy. We ran this benchmark on an instance of OpenAI's GPT-4o. The responses from the model received a composite quality score of 3.7 on a 1 to 5 scale, consistent with a passing grade. However, 36% of the questions received at least one below average score. An in-depth analysis of the responses identified at least five instances of suspected hallucination. Finally, we observed statistically significant correlations between the difficulty of the question and the quality of the response, the difficulty of the question and the relevance of the response, and the specificity of the question and the accuracy of the response as graded by the human experts. This emphasizes the need to evaluate LLMs across multiple criteria beyond difficulty or accuracy.
- North America > United States > Illinois > Cook County > Lemont (0.04)
- North America > United States > Illinois > Cook County > Evanston (0.04)
Material synthesis through simulations guided by machine learning: a position paper
Syed, Usman, Cunico, Federico, Khan, Uzair, Radicchi, Eros, Setti, Francesco, Speghini, Adolfo, Marone, Paolo, Semenzin, Filiberto, Cristani, Marco
In this position paper, we propose an approach for sustainable data collection in the field of optimal mix design for marble sludge reuse. Marble sludge, a calcium-rich residual from stone-cutting processes, can be repurposed by mixing it with various ingredients. However, determining the optimal mix design is challenging due to the variability in sludge composition and the costly, time-consuming nature of experimental data collection. Also, we investigate the possibility of using machine learning models using meta-learning as an optimization tool to estimate the correct quantity of stone-cutting sludge to be used in aggregates to obtain a mix design with specific mechanical properties that can be used successfully in the building industry. Our approach offers two key advantages: (i) through simulations, a large dataset can be generated, saving time and money during the data collection phase, and (ii) Utilizing machine learning models, with performance enhancement through hyper-parameter optimization via meta-learning, to estimate optimal mix designs reducing the need for extensive manual experimentation, lowering costs, minimizing environmental impact, and accelerating the processing of quarry sludge. Our idea promises to streamline the marble sludge reuse process by leveraging collective data and advanced machine learning, promoting sustainability and efficiency in the stonecutting sector.
- Europe > Italy (0.29)
- North America > United States > California > Los Angeles County > Los Angeles (0.04)
- Water & Waste Management > Solid Waste Management (1.00)
- Materials > Construction Materials (1.00)
- Construction & Engineering (0.95)
Text-to-Battery Recipe: A language modeling-based protocol for automatic battery recipe extraction and retrieval
Lee, Daeun, Choi, Jaewoong, Mizuseki, Hiroshi, Lee, Byungju
Recent studies have increasingly applied natural language processing (NLP) to automatically extract experimental research data from the extensive battery materials literature. Despite the complex process involved in battery manufacturing -- from material synthesis to cell assembly -- there has been no comprehensive study systematically organizing this information. In response, we propose a language modeling-based protocol, Text-to-Battery Recipe (T2BR), for the automatic extraction of end-to-end battery recipes, validated using a case study on batteries containing LiFePO4 cathode material. We report machine learning-based paper filtering models, screening 2,174 relevant papers from the keyword-based search results, and unsupervised topic models to identify 2,876 paragraphs related to cathode synthesis and 2,958 paragraphs related to cell assembly. Then, focusing on the two topics, two deep learning-based named entity recognition models are developed to extract a total of 30 entities -- including precursors, active materials, and synthesis methods -- achieving F1 scores of 88.18% and 94.61%. The accurate extraction of entities enables the systematic generation of 165 end-toend recipes of LiFePO4 batteries. Our protocol and results offer valuable insights into specific trends, such as associations between precursor materials and synthesis methods, or combinations between different precursor materials. We anticipate that our findings will serve as a foundational knowledge base for facilitating battery-recipe information retrieval. The proposed protocol will significantly accelerate the review of battery material literature and catalyze innovations in battery design and development.
- Asia > South Korea > Seoul > Seoul (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Europe > Czechia > South Moravian Region > Brno (0.04)
- Asia > Middle East > Jordan (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.86)
- Energy > Energy Storage (1.00)
- Electrical Industrial Apparatus (1.00)
- Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.48)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Learning material synthesis-process-structure-property relationship by data fusion: Bayesian Coregionalization N-Dimensional Piecewise Function Learning
Kusne, A. Gilad, McDannald, Austin, DeCost, Brian
Autonomous materials research labs require the ability to combine and learn from diverse data streams. This is especially true for learning material synthesis-process-structure-property relationships, key to accelerating materials optimization and discovery as well as accelerating mechanistic understanding. We present the Synthesis-process-structure-property relAtionship coreGionalized lEarner (SAGE) algorithm. A fully Bayesian algorithm that uses multimodal coregionalization to merge knowledge across data sources to learn synthesis-process-structure-property relationships. SAGE outputs a probabilistic posterior for the relationships including the most likely relationships given the data.
- North America > United States > Maryland > Prince George's County > College Park (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Maryland > Montgomery County > Gaithersburg (0.04)
- Research Report (1.00)
- Overview (0.68)
- Instructional Material > Course Syllabus & Notes (0.60)
- Information Technology > Data Science (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Autonomous synthesis of metastable materials
Ament, Sebastian, Amsler, Maximilian, Sutherland, Duncan R., Chang, Ming-Chiang, Guevarra, Dan, Connolly, Aine B., Gregoire, John M., Thompson, Michael O., Gomes, Carla P., van Dover, R. Bruce
Autonomous experimentation enabled by artificial intelligence (AI) offers a new paradigm for accelerating scientific discovery. Non-equilibrium materials synthesis is emblematic of complex, resource-intensive experimentation whose acceleration would be a watershed for materials discovery and development. The mapping of non-equilibrium synthesis phase diagrams has recently been accelerated via high throughput experimentation but still limits materials research because the parameter space is too vast to be exhaustively explored. We demonstrate accelerated synthesis and exploration of metastable materials through hierarchical autonomous experimentation governed by the Scientific Autonomous Reasoning Agent (SARA). SARA integrates robotic materials synthesis and characterization along with a hierarchy of AI methods that efficiently reveal the structure of processing phase diagrams. SARA designs lateral gradient laser spike annealing (lg-LSA) experiments for parallel materials synthesis and employs optical spectroscopy to rapidly identify phase transitions. Efficient exploration of the multi-dimensional parameter space is achieved with nested active learning (AL) cycles built upon advanced machine learning models that incorporate the underlying physics of the experiments as well as end-to-end uncertainty quantification. With this, and the coordination of AL at multiple scales, SARA embodies AI harnessing of complex scientific tasks. We demonstrate its performance by autonomously mapping synthesis phase boundaries for the Bi$_2$O$_3$ system, leading to orders-of-magnitude acceleration in establishment of a synthesis phase diagram that includes conditions for kinetically stabilizing $\delta$-Bi$_2$O$_3$ at room temperature, a critical development for electrochemical technologies such as solid oxide fuel cells.
- North America > United States > New York > Tompkins County > Ithaca (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > California > Los Angeles County > Pasadena (0.04)
- (2 more...)